library("dplyr")
library("MCMCpack")
library("LaplacesDemon")
library("rstan")
library("plotrix")
library(RColorBrewer)
library(corrplot)
library(factoextra)To Do:
In this task, participants have to evaluate whether a certain quantifier logically describes a scenario. Scenarios are percentages of, for example, apples that are green. Say, 20% of apples are green, is the statement that “most of the apples are green” true or false? Percentages are drawn from a uniform ensuring that as many trials have more than 50% as have less than 50%. The quantifiers of interest are Few, Fewer than half, Many, Most and More than half. The idea is that More than half is defined by more than 50% for everyone while Most may be more subjective, and it may be more than More than half for everyone. Additionally, we may inspect the relationship between Many and More than half, and also symmetries between mirrored quantifiers.
I think we could use a probit regression with a participant-specific quantifier effect and a participant-specific effect of percentage used. This way we could at least assess whether Most is more than More than half by defining the quantifier effect as a contrast between the two. It could even be tested in this framework whether More than half is 50% for everyone.
Disclaimer: I have tweaked the model yesterday to accommodate the added quantifier Many and the model comparison I planned. Therefore, this section probably needs to be updated slightly.
Let \(i\) indicate participants, \(i = 1, \ldots, I\), \(j\) indicate the quantifier, \(j = 1, \ldots, 5\), and \(k\) indicate the trial for each quantifier, \(k = 1, \ldots, K_{ij}\).1 Then \(Y_{ijk}\) is the \(i\)th participant’s response to the \(j\)th quantifier in the \(k\)th trial, and \(Y_{ijk} = 1\) if participants indicate true, and \(Y_{ijk} = 0\) if participant indicate false. Then, we may model \(Y_{ijk}\) as a Bernoulli, using the probit link function on the probabilities:
\[\begin{align*} Y_{ijk} &\sim \mbox{Bernoulli}(\pi_{ijk}),\\ \mu_{ijk} &= \Phi^{-1}(\pi_{ijk}),\\ \end{align*}\]
where the second line maps the probability space of \(\pi\) onto the real space of \(\mu\). We may now place a linear model on \(\mu_{ijk}\):
\[\mu_{ijk} = \beta_{i0} + l_{ijk}\beta_{i1} + h_{ijk}\beta_{i2} + z_{i1} \beta_{i3} + + z_{i2} \beta_{i4} + + z_{i3} \beta_{i5} + + z_{i4} \beta_{i6} + z_{i5} \beta_{i7},\]
where \(l_{ijk}\), \(h_{ijk}\) and \(z_{ij}\) are predictors: \(l_{ijk}\) is zero for percentages above 50%, and indicates the centered percentage otherwise; \(xh{ijk}\) is zero for percentages below 50%, and indicates the centered percentage otherwise, and \(z_{ij}\) is an indicator function denoting the respective quantifier (e.g., \(z_{i1} = 1\) for the quantifier Fewer, and \(z_{i1} = 0\) otherwise). Parameters \(\beta_{i0}\) are random intercepts, \(\beta_{i1}\) are random percentage effects for percentages below 50%, \(\beta_{i12}\) are random percentage effects for percentages above 50%, and \(\beta_{i3}\) to \(\beta_{i7}\) are random quantifier effects. For now, I will place the following priors:
\[\begin{align*} \beta_{i0} &\sim \mbox{Normal}(0, \sigma_0^2),\\ \beta_{i1} &\sim \mbox{Normal}(\delta_1, \sigma_1^2),\\ \beta_{i2} &\sim \mbox{Normal}(\delta_2, \sigma_2^2),\\ \beta_{i3} &\sim \mbox{Normal}(\delta_3, \sigma_3^2).\\ \beta_{i4} &\sim \mbox{Normal}(\delta_4, \sigma_4^2).\\ \beta_{i5} &\sim \mbox{Normal}(\delta_5, \sigma_5^2).\\ \beta_{i6} &\sim \mbox{Normal}(\delta_6, \sigma_6^2).\\ \beta_{i7} &\sim \mbox{Normal}(\delta_7, \sigma_7^2).\\ \end{align*}\]
Needed: Proper description of prior settings.
exclude <- c(0, 1, 5, 18, 32, 39, 49, 52, 53, 58, 62, 81, 82, 83, 85, 86, 87, 27, 76)
# 0,1,5,39,18,32,49,52,53,58,62,81,82,83,85,86,87,76
exclude2 <- c(27, 76)
dat <- read.csv("data/exp1-replication-trials.csv")
head(dat)## workerid A B percent read_and_decide_time quant read_time_one
## 1 0 "glerb" "fizzda" 20 3702 "All" 183
## 2 0 "thonk" "krangly" 82 433 "Some" 264
## 3 0 "slarm" "briddle" 11 287 "None" 216
## 4 0 "klong" "nooty" 62 16 "All" 48
## 5 0 "dring" "larfy" 28 384 "None" 113
## 6 0 "floom" "plerful" 92 126 "Some" 104
## response
## 1 false
## 2 false
## 3 true
## 4 true
## 5 false
## 6 false
## [1] 23220
# Exclusion based on Sonia's code
## Participant-level
dat <- subset(dat, !(workerid %in% exclude))
nrow(dat)## [1] 18318
## Trial-level
dat <- subset(dat, read_and_decide_time > 300)
dat <- subset(dat, read_and_decide_time < 2500)
nrow(dat)## [1] 17233
## "All" "Few" "Fewer than half" "Many"
## 2 48 48 50
## "More than half" "Most" "None" "Some"
## 50 50 3 1
##
## FALSE TRUE
## 8477 8756
##
## "All" "Few" "Fewer than half" "Many" "More than half" "Most" "None"
## false 89 2034 1798 1419 1724 1798 132
## true 10 1277 1605 1943 1726 1606 9
##
## "Some"
## false 13
## true 50
dat$qq <- as.numeric(factor(dat$quant))
dat <- subset(dat, qq %in% 2:6)
prop <- tapply(as.numeric(factor(dat$response)) - 1, list(dat$percent, dat$qq), mean, na.rm = T)qcols <- brewer.pal(5, "Dark2")
matplot(as.numeric(rownames(prop)), prop
, pch = 19, col = qcols
, xlab = "Percent", ylab = "Proportion 'true' responses"
, frame.plot = F)
abline(v = 50, lwd = 1.5, col = "darkgrey")
abline(h = .50, lwd = 1.5, col = "darkgrey")
legend(75, .7, legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bg = "white", box.col = "white")I now recode the responses to correspond to the expected direction of response. I will therefore flip TRUE and FALSE responses for the quantifiers few and fewer than half.
dat$resp <- case_when(
dat$qq %in% 4:6 ~ as.numeric(factor(dat$response)) - 1,
dat$qq %in% 2:3 ~ -as.numeric(factor(dat$response)) + 2
)prop <- tapply(dat$resp, list(dat$percent, dat$qq), mean, na.rm = T)
qcols <- brewer.pal(5, "Dark2")
matplot(as.numeric(rownames(prop)), prop
, pch = 19, col = qcols
, xlab = "Percent", ylab = "Proportion 'true' responses"
, frame.plot = F)
abline(v = 50, lwd = 1.5, col = "darkgrey")
abline(h = .50, lwd = 1.5, col = "darkgrey")
legend(75, .7, legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bg = "white", box.col = "white")x <- seq(0, 5, .01)
ysig <- dlnorm(x, .1, .5)
ysig2 <- 2 * x * dinvgamma(x^2, 2, .1)
plot(x, ysig, type = "l")
lines(x, ysig2, col = 2)ydelt <- rnorm(x, 0, .1)
M <- 10000
ysig2 <- rinvgamma(M, 2, .1)
ydelt <- rnorm(M, 0, .1)
prioreff <- rnorm(M, ydelt, sqrt(ysig2))
layout(matrix(1:4, ncol = 2, byrow = T))
hist(prioreff)
hist(ydelt)
hist(pnorm(prioreff), xlim = c(0, 1))
hist(pnorm(ydelt), xlim = c(0, 1))data {
int<lower=1> D; // #Dimensions of the model
int<lower=0> N; // #Observations
int<lower=1> I; // #Participants
int<lower=0,upper=1> y[N]; // Data 0,1
vector[N] cperc; // Centered Percentages
int<lower=1,upper=I> sub[N]; // participant vector
int<lower=0,upper=1> few[N]; // Few
int<lower=0,upper=1> fewer[N]; // Fewer than half
int<lower=0,upper=1> many[N]; // Many
int<lower=0,upper=1> more[N]; // More than half
int<lower=0,upper=1> most[N]; // Most
int<lower=0,upper=1> above[N]; // Above 50 percent?
}
parameters {
real delta[D]; // Means of betas
real<lower=0> sigma2[D]; // variance of betas
vector[D] beta[I]; // vectors of betas
real nu[D]; // Means of alphas
real<lower=0> sigma2alpha[D]; // variance of alphas
vector<lower=0>[D] alpha[I]; // vectors of alphas
vector<lower=0,upper=1>[D] gamma[I]; //vector of gammas
}
transformed parameters {
real<lower=0> sigma[D];
real<lower=0> sigmaalpha[D];
sigma = sqrt(sigma2);
sigmaalpha = sqrt(sigma2alpha);
}
model {
vector[N] mu;
vector[N] p;
delta ~ normal(0, 5);
sigma2 ~ inv_gamma(2, .2);
nu ~ normal(0, 5);
sigma2alpha ~ inv_gamma(2, .2);
for (i in 1:I)
beta[i] ~ normal(delta, sigma);
for (i in 1:I)
alpha[i] ~ lognormal(nu, sigmaalpha);
for (i in 1:I)
gamma[i] ~ beta(2, 20);
for (n in 1:N)
mu[n] = few[n] * (cperc[n] - beta[sub[n], 1]) / alpha[sub[n], 1] + fewer[n] * (cperc[n] - beta[sub[n], 2]) / alpha[sub[n], 2] + many[n] * (cperc[n] - beta[sub[n], 3]) / alpha[sub[n], 3] + more[n] * (cperc[n] - beta[sub[n], 4]) / alpha[sub[n], 4] + most[n] * (cperc[n] - beta[sub[n], 5]) / alpha[sub[n], 5];
for (n in 1:N)
p[n] = few[n] * gamma[sub[n], 1] + fewer[n] * gamma[sub[n], 2] + many[n] * gamma[sub[n], 3] + more[n] * gamma[sub[n], 4] + most[n] * gamma[sub[n], 5] + (1 - 2 * (few[n] * gamma[sub[n], 1] + fewer[n] * gamma[sub[n], 2] + many[n] * gamma[sub[n], 3] + more[n] * gamma[sub[n], 4] + most[n] * gamma[sub[n], 5])) * inv_logit(mu[n]);
y ~ bernoulli(p);
}
getDat <- function(dat){
# dat <- subset(dat, qq %in% 4:6)
D <- 5
N <- nrow(dat)
I <- length(unique(dat$workerid))
y <- dat$resp
cperc <- (dat$percent - 50) / 100
sub <- as.numeric(factor(dat$workerid))
few <- ifelse(dat$qq == 2, 1, 0)
fewer <- ifelse(dat$qq == 3, 1, 0)
many <- ifelse(dat$qq == 4, 1, 0)
more <- ifelse(dat$qq == 5, 1, 0)
most <- ifelse(dat$qq == 6, 1, 0)
above <- as.numeric(cperc > 0)
# res <- glm(y ~ cperc + few + fewer + many + more + most, family = binomial(link = "probit"),
# data = dat)
standat <- list(D = D, N = N, I = I, y = y, cperc = cperc, sub = sub,
few = few, fewer = fewer, many = many, more = more, most = most
, above = above)
return(standat)
}
myRunner <- function(standat, iter = 1000, warmup = 400, mod, nchains = 4, ...){
inits <- function(...){
list(delta = runif(standat$D, -.5, .5)
, sigma2 = runif(standat$D, 1, 1.5)
, beta = matrix(runif(standat$I * standat$D, -1, 1), nrow = standat$I, ncol = standat$D)
, nu = runif(standat$D, .1, 1)
, sigma2alpha = runif(standat$D, 1, 1.5)
, alpha = matrix(runif(standat$I * standat$D, .1, 1), nrow = standat$I, ncol = standat$D))
}
# inits <- list(c1 = inits)
fit <- sampling(mod, verbose=T,
data = standat,
iter = iter,
warmup = warmup,
chains = nchains,
cores = nchains,
# init = lapply(1:4, inits),
pars = c("delta", "nu", "beta", "alpha", "gamma", "sigma2", "sigma2alpha")
, ...)
return(fit)}
predat <- getDat(dat)rerun <- T
logmodfit <- myRunner(predat
, iter = 2500
, warmup = 750
, mod = logmod
, control = list(adapt_delta = .97, max_treedepth = 14)
, nchains = 6)
save(logmodfit, file = "outstudy1e.rda")hist(summary(logmodfit)$summary[,"n_eff"]
, breaks = 100
, main = ""
, xlab = "Number of effective samples")## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## [1] 10500 1086
pMeans <- colMeans(pEst)
I <- predat$I
pdelta1 <- pMeans[1]
pdelta2 <- pMeans[2]
pdelta3 <- pMeans[3]
pdelta4 <- pMeans[4]
pdelta5 <- pMeans[5]
pnu1 <- pMeans[6]
pnu2 <- pMeans[7]
pnu3 <- pMeans[8]
pnu4 <- pMeans[9]
pnu5 <- pMeans[10]
pbeta1 <- pMeans[10 + 1:I]
pbeta2 <- pMeans[10 + I + 1:I]
pbeta3 <- pMeans[10 + 2 * I + 1:I]
pbeta4 <- pMeans[10 + 3 * I + 1:I]
pbeta5 <- pMeans[10 + 4 * I + 1:I]
palpha1 <- pMeans[10 + 5 * I + 1:I]
palpha2 <- pMeans[10 + 6 * I + 1:I]
palpha3 <- pMeans[10 + 7 * I + 1:I]
palpha4 <- pMeans[10 + 8 * I + 1:I]
palpha5 <- pMeans[10 + 9 * I + 1:I]
pgamma1 <- pMeans[10 + 10 * I + 1:I]
pgamma2 <- pMeans[10 + 11 * I + 1:I]
pgamma3 <- pMeans[10 + 12 * I + 1:I]
pgamma4 <- pMeans[10 + 13 * I + 1:I]
pgamma5 <- pMeans[10 + 14 * I + 1:I]
cperc <- (1:100 - 50)/100
curve.calc <- function(x, p = cperc){
a <- (p - x[1])/x[2]
x[3] + (1 - 2* x[3]) * exp(a)/(exp(a) + 1)
}
pps1 <- curve.calc(c(pdelta1, exp(pnu1), mean(pgamma1)))
pps2 <- curve.calc(c(pdelta2, exp(pnu2), mean(pgamma2)))
pps3 <- curve.calc(c(pdelta3, exp(pnu3), mean(pgamma3)))
pps4 <- curve.calc(c(pdelta4, exp(pnu4), mean(pgamma4)))
pps5 <- curve.calc(c(pdelta5, exp(pnu5), mean(pgamma5)))
layout(matrix(1:6, ncol = 2, byrow = T))
plot(1:100, pps1, type = "l", lwd = 2, col = qcols[1], ylim = c(0,1)
, ylab = "Probability 'true' response", xlab = "Percent")
lines(1:100, pps2, lwd = 2, col = qcols[2])
lines(1:100, pps3, lwd = 2, col = qcols[3])
lines(1:100, pps4, lwd = 2, col = qcols[4])
lines(1:100, pps5, lwd = 2, col = qcols[5])
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
legend("bottomright", legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bty = "n")
res1 <- apply(cbind(pbeta1, palpha1, pgamma1), 1, curve.calc)
matplot(res1, type = "l", lty = 1, col = qcols[1], main = "Few")
res2 <- apply(cbind(pbeta2, palpha2, pgamma2), 1, curve.calc)
matplot(res2, type = "l", lty = 1, col = qcols[2], main = "Fewer than half")
res3 <- apply(cbind(pbeta3, palpha3, pgamma3), 1, curve.calc)
matplot(res3, type = "l", lty = 1, col = qcols[3], main = "Many")
res4 <- apply(cbind(pbeta4, palpha4, pgamma4), 1, curve.calc)
matplot(res4, type = "l", lty = 1, col = qcols[4], main = "More than half")
res5 <- apply(cbind(pbeta5, palpha5, pgamma5), 1, curve.calc)
matplot(res5, type = "l", lty = 1, col = qcols[5], main = "Most")layout(matrix(c(0, 1,1, 2,2, 0, 3,3,4,4,5,5), nrow = 2, byrow = T))
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
matplot(1:100, res1, type = "l", lty = 1, main = "Few", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps1, lwd = 3, col = qcols[1])
matplot(1:100, res2, type = "l", lty = 1, main = "Fewer than half", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps2, lwd = 3, col = qcols[2])
matplot(1:100, res3, type = "l", lty = 1, main = "Many", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps3, lwd = 3, col = qcols[3])
matplot(1:100, res4, type = "l", lty = 1, main = "More than half", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps4, lwd = 3, col = qcols[4])
matplot(1:100, res5, type = "l", lty = 1, main = "Most", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps5, lwd = 3, col = qcols[5])make.cor.plot <- function(mat, ...){
M <- cor(mat)
corrplot(M, type = "upper", mar = c(0,0,2,0), ...)
}
(M1 <- cor(cbind("few" = pbeta1, "fewer" = pbeta2, "many" = pbeta3, "more" = pbeta4, "most" = pbeta5)))## few fewer many more most
## few 1.00000000 -0.11875910 0.30317693 0.00064865 -0.2267883
## fewer -0.11875910 1.00000000 -0.08713616 -0.20661865 0.1195840
## many 0.30317693 -0.08713616 1.00000000 0.06969840 0.1719174
## more 0.00064865 -0.20661865 0.06969840 1.00000000 -0.1694777
## most -0.22678827 0.11958399 0.17191740 -0.16947768 1.0000000
sds1 <- apply(cbind("few" = pbeta1, "fewer" = pbeta2, "many" = pbeta3, "more" = pbeta4, "most" = pbeta5), 2, sd)
means1 <- apply(cbind("few" = pbeta1, "fewer" = pbeta2, "many" = pbeta3, "more" = pbeta4, "most" = pbeta5), 2, mean)
(M2 <- cor(cbind("few" = palpha1, "fewer" = palpha2, "many" = palpha3, "more" = palpha4, "most" = palpha5)))## few fewer many more most
## few 1.00000000 0.05281385 0.07725865 -0.12947064 0.09840626
## fewer 0.05281385 1.00000000 -0.09618068 0.04532783 0.08532919
## many 0.07725865 -0.09618068 1.00000000 0.10239056 0.25590820
## more -0.12947064 0.04532783 0.10239056 1.00000000 0.26718043
## most 0.09840626 0.08532919 0.25590820 0.26718043 1.00000000
sds2 <- apply(cbind("few" = palpha1, "fewer" = palpha2, "many" = palpha3, "more" = palpha4, "most" = palpha5), 2, sd)
means2 <- apply(cbind("few" = palpha1, "fewer" = palpha2, "many" = palpha3, "more" = palpha4, "most" = palpha5), 2, mean)
(M3 <- cor(cbind("few" = pgamma1, "fewer" = pgamma2, "many" = pgamma3, "more" = pgamma4, "most" = pgamma5)))## few fewer many more most
## few 1.0000000 0.7542331 0.4082249 0.4606126 0.6538767
## fewer 0.7542331 1.0000000 0.4216081 0.3683092 0.5726053
## many 0.4082249 0.4216081 1.0000000 0.2359867 0.5168501
## more 0.4606126 0.3683092 0.2359867 1.0000000 0.3682729
## most 0.6538767 0.5726053 0.5168501 0.3682729 1.0000000
layout(matrix(c(1,1,2,2,3,3,0,4,4,5,5,0), ncol = 6, byrow = T))
make.cor.plot(cbind("Vagueness" = palpha1, "Threshold" = pbeta1, "Error" = pgamma1), title = "Few")
make.cor.plot(cbind("Vagueness" = palpha2, "Threshold" = pbeta2, "Error" = pgamma2), title = "Fewer")
make.cor.plot(cbind("Vagueness" = palpha3, "Threshold" = pbeta3, "Error" = pgamma3), title = "Many")
make.cor.plot(cbind("Vagueness" = palpha4, "Threshold" = pbeta4, "Error" = pgamma4), title = "More")
make.cor.plot(cbind("Vagueness" = palpha5, "Threshold" = pbeta5, "Error" = pgamma5), title = "Most")layout(matrix(c(1,2,3,3), ncol = 2, byrow = T), heights = c(.4, .6))
plot(pbeta4 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "More than half", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta5 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "Most", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta4 * 100 + 50, pbeta5 * 100 + 50
, pch = 19, col = adjustcolor(1, .5)
, ylab = "Threshold for 'most'"
, xlab = "Threshold for 'more than half'"
, xlim = c(40, 65)
, ylim = c(40, 65)
)
abline(0, 1, lwd = 1.5, col = "firebrick")mbeta4 <- pEst[,10 + 3 * I + 1:I]
mbeta5 <- pEst[,10 + 4 * I + 1:I]
mdiff <- mbeta4
for(i in 1:ncol(mdiff)){
mdiff[,i] <- apply(cbind(mbeta4[,i], mbeta5[,i]), 1, function(x) x[2] - x[1])
}
pCIs <- apply(mdiff, 2, quantile, probs = c(.025, .975))
pThresh <- apply(mdiff, 2, mean)indord <- order(pThresh)
plotCI(1:I, pThresh[indord]
, li = pCIs[1, indord], ui = pCIs[2, indord]
, pch = 21, pt.bg = "slateblue"
, ylab = "Most - More than half"
, xlab = "Participant")
abline(h = 0)Bayes factor assessing whether More is more than More than half for everyone.
## Prior probability
M <- 100000
s2 <- rinvgamma(M, 2, .1)
beta4 <- rnorm(M, 0, .1)
prioreff <- exp(pnorm(0, beta4, sqrt(s2), lower.tail = T, log.p = T) * I)
priorneg <- mean(prioreff)
## Posterior Probability
targ <- mdiff
good <- targ < 0 #evaluate their sign
all.good <- apply(good, 1, mean) #evaluate how often all theta estimates are postitive
postneg <- mean(all.good == 1) #Posterior probability of all theta_i being positive
# priorneg
# postneg
bfmomo <- postneg/priorneglayout(matrix(c(1,2,3,3), ncol = 2, byrow = T), heights = c(.4, .6))
plot(pbeta4 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "More than half", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta3 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "Many", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta4 * 100 + 50, pbeta3 * 100 + 50
, pch = 19, col = adjustcolor(1, .5)
, ylab = "Threshold for 'many'"
, xlab = "Threshold for 'more than half'"
, xlim = c(10, 60)
, ylim = c(10, 60)
)
abline(0, 1, lwd = 1.5, col = "firebrick")mbeta4 <- pEst[,10 + 3 * I + 1:I]
mbeta3 <- pEst[,10 + 2 * I + 1:I]
mdiffmany <- mbeta4
for(i in 1:ncol(mdiff)){
mdiffmany[,i] <- apply(cbind(mbeta3[,i], mbeta4[,i]), 1, function(x) x[2] - x[1])
}
pCIs <- apply(mdiffmany, 2, quantile, probs = c(.025, .975))
pThresh <- apply(mdiffmany, 2, mean)indord <- order(pThresh)
plotCI(1:I, pThresh[indord]
, li = pCIs[1, indord], ui = pCIs[2, indord]
, pch = 21, pt.bg = "slateblue"
, ylab = "More than half - Many"
, xlab = "Participant")
abline(h = 0)Bayes factor assessing whether Many is less than More than half for everyone.
## Prior probability
M <- 100000
s2 <- rinvgamma(M, 2, .1)
beta3 <- rnorm(M, 0, .1)
prioreff <- exp(pnorm(0, beta3, sqrt(s2), lower.tail = T, log.p = T) * I)
priorneg <- mean(prioreff)
## Posterior Probability
targ <- mdiffmany
good <- targ > 0 #evaluate their sign
all.good <- apply(good, 1, mean) #evaluate how often all theta estimates are postitive
postneg <- mean(all.good == 1) #Posterior probability of all theta_i being positive
# priorneg
postneg## [1] 0
As discussed in our meetings, we wanted to do a cluster analysis on the parameters from the probit model to assess whether there are distinct groups of individuals with certain parameter patterns. For example, we hypothesized that there might be a group of individuals who interpret the quantifier Most as More than half and therefore have a threshold close to 50% (or 0 in the current parameterization). Here, we do an extensive cluster analysis on 1. the parameters corresponding to the quantifier Most, 2. the parameters corresponding to the quantifier Many, and 3. an additional analysis on all parameter from all models.
The threshold, vagueness and guessing parameters for all participants were submitted to the cluster analysis. A determination of the optimal number of clusters consistently favored \(K = 2\) clusters.
mostpars <- cbind(pbeta5, palpha5, pgamma5)
layout(matrix(c(1,1,2,2,0,3,3,0), ncol = 4, byrow = T))
fviz_nbclust(mostpars, kmeans, method = "wss")Determining the optimal number of clusters.
Determining the optimal number of clusters.
gap_stat <- cluster::clusGap(mostpars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)Determining the optimal number of clusters.
The figure below shows the response curves based on the cluster centers as parameters. As can be seen there is a clear shift in threshold between the two clusters, one perfectly centered at 50%, and one much higher. This is consistent with our hypothesis that there might be some individuals who interpret Most as more than half, and others interpret it as more than more than half.
res.clust <- kmeans(x = mostpars, centers = 2, nstart = 20)
curveclust <- apply(res.clust$centers, 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Most")
abline(v = 50)allpars <- cbind(pbeta1, palpha1, pgamma1
, pbeta2, palpha2, pgamma2
, pbeta3, palpha3, pgamma3
, pbeta4, palpha4, pgamma4
, pbeta5, palpha5, pgamma5)
draw.ind.clust <- function(pars, cluster, cols = qcols, main = ""){
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
layout(matrix(c(1, 4, 2, 3), ncol = 2, byrow = T))
n <- length(unique(cluster))
plot(pars[, 1], pars[, 2], col = qcols[cluster]
, xlab = "Threshold", ylab = "Vagueness"
, pch = 19)
legend("topleft", legend = paste("Cluster", 1:n), fill = cols[1:n], bty = "n")
plot(pars[, 1], pars[, 3], col = qcols[cluster]
, xlab = "Threshold", ylab = "Guessing"
, pch = 19)
legend("topleft", legend = paste("Cluster", 1:n), fill = cols[1:n], bty = "n")
plot(pars[, 2], pars[, 3], col = qcols[cluster]
, xlab = "Vagueness", ylab = "Guessing"
, pch = 19)
legend("topleft", legend = paste("Cluster", 1:n), fill = cols[1:n], bty = "n")
plot(0,0,axes = F, col = "white", ylab = "", xlab = "")
text(0,0, main, cex = 4)
}Next, we investigated individual differences for all quantifiers based on the cluster analysis. The plots for the quantifier Most illustrate that the threshold is consistently and clear-cut associated with cluster affiliation. There are no additional consistencies for the other quantifiers based on this cluster analysis.
The threshold, vagueness and guessing parameters for all participants were submitted to the cluster analysis. A determination of the optimal number of clusters was inconsistent with either \(K = 1\) or \(K = 2\) favored. We therefore ran a cluster analysis with \(K = 2\) clusters.
manypars <- cbind(pbeta3, palpha3, pgamma3)
layout(matrix(c(1,1,2,2,0,3,3,0), ncol = 4, byrow = T))
fviz_nbclust(manypars, kmeans, method = "wss")Determining the optimal number of clusters.
Determining the optimal number of clusters.
gap_stat <- cluster::clusGap(manypars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)Determining the optimal number of clusters.
The figure below shows the response curves based on the cluster centers as parameters. As can be seen there is a clear shift in threshold between the two clusters, one perfectly centered at 50%, and one much lower. This is consistent with our hypothesis that there might be some individuals who interpret Many as more than half, and others interpret it as less than more than half.
res.clust.many <- kmeans(x = manypars, centers = 2, nstart = 20)
curveclust <- apply(res.clust.many$centers, 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Many")
abline(v = 50)Next, we investigated individual differences for all quantifiers based on the cluster analysis for quantifier Many. The plots for the quantifier Many illustrate that the threshold is consistently and clear-cut associated with cluster affiliation. In addition, individuals from cluster 1 — the cluster with thresholds lower than 50% — have the tendency for an increased vagueness parameter as well. This seems plausible: If the quantifier Many is interpreted as more than half, as seems to be the case for cluster 2, then there should be reduced vaguenuess associated with this quantifier.
In the second plot we assess whether the clustering on quantifier Many has a relationship with potential clustering for quantifier Few which is often thought of as the mirror quantifier of Many. However, the results are mostly inconsistent. There seems to be a small tendency that cluster 1 has a lower threshold on average than cluster 2, which seems plausible. However, this trend is mild at best. There are no other consistencies from this cluster analysis for the other quantifiers. Interestingly, the individuals who interpret Many as more than half do not seem to be the same individuals who interpret Most as more than half.
threshmost <- allpars[, 13]
threshmany <- allpars[, 7]
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
n <- 2
clustcols <- ifelse(res.clust$cluster == 1
, adjustcolor(qcols[res.clust.many$cluster], alpha.f = .4)
, adjustcolor(qcols[res.clust.many$cluster], alpha.f = 1))
plot(threshmost, threshmany
, col = clustcols
, xlab = "Threshold Most", ylab = "Threshold Many"
, pch = 19, cex = 1.5)
legend("bottomright", legend = paste0("Cluster [", c(1,1,2,2), ", ", c(1,2,1,2), "]")
, fill = ifelse(c(1,2,1,2) == 1
, adjustcolor(qcols[c(1,1,2,2)], alpha.f = .4)
, adjustcolor(qcols[c(1,1,2,2)], alpha.f = 1))
, bty = "n")The threshold, vagueness and guessing parameters for all participants for quantifiers Most and Many were submitted to the cluster analysis. A determination of the optimal number of clusters was inconsistent with either \(K = 1\) or \(K = 6\) favored. From a theoretical perspective given the two previous analyses we expected \(K = 4\) clusters, and therefore picked this number.
combpars <- cbind(allpars[, 7:9], allpars[, 13:15])
layout(matrix(c(1,1,2,2,0,3,3,0), ncol = 4, byrow = T))
fviz_nbclust(combpars, kmeans, method = "wss")Determining the optimal number of clusters.
Determining the optimal number of clusters.
gap_stat <- cluster::clusGap(combpars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)Determining the optimal number of clusters.
The figure below shows the response curves for Many and Most based on the cluster centers as parameters. As can be seen there is a clear shift in threshold between the four clusters for Many, and between most of the clusters for Most. It almost seems like the 50%cluster for Many breaks down more, but the 50% cluster for Most is quite consistent.
res.clust.comb <- kmeans(x = combpars, centers = 4, nstart = 20)
layout(matrix(1:2, ncol = 2))
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
curveclust <- apply(res.clust.comb$centers[, 1:3], 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Many")
abline(v = 50)
curveclust <- apply(res.clust.comb$centers[, 4:6], 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Most")
abline(v = 50)threshmost <- allpars[, 13]
threshmany <- allpars[, 7]
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
n <- 2
plot(threshmost, threshmany
, col = qcols[res.clust.comb$cluster]
, xlab = "Threshold Most", ylab = "Threshold Many"
, pch = 19, cex = 1.5)
legend("bottomright", legend = paste("Cluster", 1:4)
, fill = qcols[1:4]
, bty = "n")gap_stat <- cluster::clusGap(allpars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)res.clust.all <- kmeans(x = allpars, centers = 4, nstart = 25)
curveclust.few <- apply(res.clust.all$centers[, 1:3], 1, curve.calc)
curveclust.fewer <- apply(res.clust.all$centers[, 4:6], 1, curve.calc)
curveclust.many <- apply(res.clust.all$centers[, 7:9], 1, curve.calc)
curveclust.more <- apply(res.clust.all$centers[, 10:12], 1, curve.calc)
curveclust.most <- apply(res.clust.all$centers[, 13:15], 1, curve.calc)
matplot(curveclust.most, type = "l", lty = 1, lwd = 2, col = c(qcols[5], adjustcolor(qcols[5], alpha.f = .5)))
matplot(curveclust.few, type = "l", lty = 1, lwd = 2, col = c(qcols[1], adjustcolor(qcols[1], alpha.f = .5)), add = T)
matplot(curveclust.fewer, type = "l", lty = 1, lwd = 2, col = c(qcols[2], adjustcolor(qcols[2], alpha.f = .5)), add = T)
matplot(curveclust.many, type = "l", lty = 1, lwd = 2, col = c(qcols[3], adjustcolor(qcols[3], alpha.f = .5)), add = T)
matplot(curveclust.more, type = "l", lty = 1, lwd = 2, col = c(qcols[4], adjustcolor(qcols[4], alpha.f = .5)), add = T)
abline(v = 50)Looking at individual differences for all quantifiers based on cluster analysis with all quantifiers
res.clust.all <- kmeans(x = allpars, centers = 2, nstart = 25)
curveclust.few <- apply(res.clust.all$centers[, 1:3], 1, curve.calc)
curveclust.fewer <- apply(res.clust.all$centers[, 4:6], 1, curve.calc)
curveclust.many <- apply(res.clust.all$centers[, 7:9], 1, curve.calc)
curveclust.more <- apply(res.clust.all$centers[, 10:12], 1, curve.calc)
curveclust.most <- apply(res.clust.all$centers[, 13:15], 1, curve.calc)
matplot(curveclust.most, type = "l", lty = 1, lwd = 2, col = c(qcols[5], adjustcolor(qcols[5], alpha.f = .5)))
matplot(curveclust.few, type = "l", lty = 1, lwd = 2, col = c(qcols[1], adjustcolor(qcols[1], alpha.f = .5)), add = T)
matplot(curveclust.fewer, type = "l", lty = 1, lwd = 2, col = c(qcols[2], adjustcolor(qcols[2], alpha.f = .5)), add = T)
matplot(curveclust.many, type = "l", lty = 1, lwd = 2, col = c(qcols[3], adjustcolor(qcols[3], alpha.f = .5)), add = T)
matplot(curveclust.more, type = "l", lty = 1, lwd = 2, col = c(qcols[4], adjustcolor(qcols[4], alpha.f = .5)), add = T)
abline(v = 50)Looking at individual differences for all quantifiers based on cluster analysis with all quantifiers
Simple rule-based responding by setting a fixed boundary. The lines represent different levels of perceptual noise. Where this source of noise would come from in our experiment is completely unclear to me.
## One response
resp <- function(p) rbinom(length(p), 1, p)
## Curve simulation
sim.curve.rule <- function(boundary, perc.noise){
x <- seq(0, 1, .01)
xmat <- matrix(rep(x, 10000), nrow = length(x))
res <- apply(xmat, 1, rule.resp, noise = perc.noise, boundary = boundary)
return(colMeans(res))
}
## Response rules
## Deliberate rule-based respongind is perfect responding
rule.resp <- function(x, boundary, noise = 0){
x.perc <- rnorm(length(x), x, noise)
p <- ifelse(x.perc > boundary, 1, 0)
resp(p)
}
plot(sim.curve.rule(0.5, 0), type = "l", lwd = 2, col = "darkgray", ylab = "Prop true responses", xlab = "Percent")
lines(sim.curve.rule(0.5, 0.01), col = "slateblue", lwd = 2) # 1% perceptual noise
lines(sim.curve.rule(0.5, 0.05), col = "firebrick", lwd = 2) # 5% perceptual noiseNegation adds error/noise to the probability of responding correctly. This can either be uniform (makes most sense to me) or a function of the distance from the boundary.
## One response
resp <- function(p) rbinom(length(p), 1, p)
## Curve simulation
sim.curve.neg <- function(boundary, perc.noise, neg.noise, neg){
x <- seq(0, 1, .01)
xmat <- matrix(rep(x, 10000), nrow = length(x))
res <- apply(xmat, 1, rule.resp.neg
, noise = perc.noise
, boundary = boundary
, neg.noise = neg.noise
, neg = neg)
return(colMeans(res))
}
## Response rules
## Deliberate rule-based respongind is perfect responding
rule.resp.neg <- function(x, boundary, noise = 0, neg.noise = 0.05, neg = T){
x.perc <- rnorm(length(x), x, noise)
p <- ifelse(x.perc > boundary, 1, 0)
if(neg == T){
p.noise <- rnorm(length(p), p, neg.noise)
p <- ifelse(p.noise>1, 1, ifelse(p.noise<0, 0, p.noise))
}
resp(p)
}
plot(sim.curve.neg(0.5, 0.01, 0, F), type = "l", lwd = 2, col = "darkgray", ylab = "Prop true responses", xlab = "Percent")
lines(sim.curve.neg(0.5, 0.01, .1, T), col = "slateblue", lwd = 2) # 1% perceptual noise
legend("bottomright", legend = c("More than half", "Fewer than half"), fill = c("darkgray", "slateblue"), bty = "n")I had a bit of an issue finding a good way of adding noise as a function of distance from boundary. Have to play a bit more.
## One response
resp <- function(p) rbinom(length(p), 1, p)
## Curve simulation
sim.curve.neg <- function(boundary, perc.noise, neg.noise, neg){
x <- seq(0, 1, .01)
xmat <- matrix(rep(x, 100000), nrow = length(x))
res <- apply(xmat, 1, rule.resp.neg
, noise = perc.noise
, boundary = boundary
, neg.noise = neg.noise
, neg = neg)
return(colMeans(res))
}
## Response rules
## Deliberate rule-based respongind is perfect responding
rule.resp.neg <- function(x, boundary, noise = 0, neg.noise = 0.2, neg = T){
x.perc <- rnorm(length(x), x, noise)
p <- ifelse(x.perc > boundary, 1, 0)
z.noise <- dnorm(qnorm(x), sd = neg.noise) * neg.noise
dir.noise <- sample(c(-1, 1), size = length(x), replace = T)
if(neg == T){
p.noise <- p + z.noise * dir.noise
p <- ifelse(p.noise>1, 1, ifelse(p.noise<0, 0, p.noise))
}
resp(p)
}
plot(sim.curve.neg(0.5, 0.01, 0, F), type = "l", lwd = 2, col = "darkgray", ylab = "Prop true responses", xlab = "Percent")
lines(sim.curve.neg(0.5, 0.01, .2, T), col = "slateblue", lwd = 2) # 1% perceptual noise
legend("bottomright", legend = c("More than half", "Fewer than half"), fill = c("darkgray", "slateblue"), bty = "n")Vague quantifier implies sampling the boundary from a distribution for each trial new.
## One response
resp <- function(p) rbinom(length(p), 1, p)
## Curve simulation
sim.curve.vague <- function(boundary, perc.noise, bound.noise){
x <- seq(0, 1, .01)
xmat <- matrix(rep(x, 100000), nrow = length(x))
res <- apply(xmat, 1, resp.vague
, noise = perc.noise
, boundary = boundary
, bound.noise = bound.noise)
return(colMeans(res))
}
## Response rules
## Deliberate rule-based respongind is perfect responding
resp.vague <- function(x, boundary, noise = 0, bound.noise = 0.2){
x.perc <- rnorm(length(x), x, noise)
bound.perc <- rnorm(length(x), boundary, bound.noise)
p <- ifelse(x.perc > bound.perc, 1, 0)
resp(p)
}
plot(sim.curve.vague(0.5, 0.01, 0), type = "l", lwd = 2, col = "darkgray", ylab = "Prop true responses", xlab = "Percent")
lines(sim.curve.vague(0.5, 0.01, .1), col = "slateblue", lwd = 2) # 1% perceptual noise
legend("bottomright", legend = c("More than half", "Fewer than half"), fill = c("darkgray", "slateblue"), bty = "n")# Those people participated twice: 3,4,5,9,10,14,21,30,33,34,48,56,69,81,17,18
# Others: 1,25,29,43,46,58,70,72,74, 82.
dat <- read.csv("data/exp1-replication-v2-trials.csv")
exclude2 <- c(27, 76)
head(dat)
nrow(dat)
# Exclusion based on Sonia's code
## Participant-level
dat <- subset(dat, !(workerid %in% exclude2))
nrow(dat)
## Trial-level
dat <- subset(dat, read_and_decide_time > 300)
dat <- subset(dat, read_and_decide_time < 2500)
nrow(dat)
table(dat$quant, dat$workerid)[, 1]
# hist(dat$percent)
table(dat$percent>50)
table(dat$response, dat$quant)
dat$qq <- as.numeric(factor(dat$quant))
dat <- subset(dat, qq %in% 2:6)
prop <- tapply(as.numeric(factor(dat$response)) - 1, list(dat$percent, dat$qq), mean, na.rm = T)qcols <- brewer.pal(5, "Dark2")
matplot(as.numeric(rownames(prop)), prop
, pch = 19, col = qcols
, xlab = "Percent", ylab = "Proportion 'true' responses"
, frame.plot = F)
abline(v = 50, lwd = 1.5, col = "darkgrey")
abline(h = .50, lwd = 1.5, col = "darkgrey")
legend(75, .7, legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bg = "white", box.col = "white")I now recode the responses to correspond to the expected direction of response. I will therefore flip TRUE and FALSE responses for the quantifiers few and fewer than half.
dat$resp <- case_when(
dat$qq %in% 4:6 ~ as.numeric(factor(dat$response)) - 1,
dat$qq %in% 2:3 ~ -as.numeric(factor(dat$response)) + 2
)prop <- tapply(dat$resp, list(dat$percent, dat$qq), mean, na.rm = T)
qcols <- brewer.pal(5, "Dark2")
matplot(as.numeric(rownames(prop)), prop
, pch = 19, col = qcols
, xlab = "Percent", ylab = "Proportion 'true' responses"
, frame.plot = F)
abline(v = 50, lwd = 1.5, col = "darkgrey")
abline(h = .50, lwd = 1.5, col = "darkgrey")
legend(75, .7, legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bg = "white", box.col = "white")dat$above <- dat$percent > 50
prop.sub <- tapply(dat$resp, list(dat$workerid, dat$above), mean, na.rm = T)
excludeJ <- which(prop.sub[, 2] < .7 | prop.sub[, 1] > .3)
# New exclusions
## Participant-level
dat <- subset(dat, !(workerid %in% excludeJ))
nrow(dat)## [1] 16930
prop <- tapply(dat$resp, list(dat$percent, dat$qq), mean, na.rm = T)
qcols <- brewer.pal(5, "Dark2")
matplot(as.numeric(rownames(prop)), prop
, pch = 19, col = qcols
, xlab = "Percent", ylab = "Proportion 'true' responses"
, frame.plot = F)
abline(v = 50, lwd = 1.5, col = "darkgrey")
abline(h = .50, lwd = 1.5, col = "darkgrey")
legend(75, .7, legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bg = "white", box.col = "white")getDat <- function(dat){
# dat <- subset(dat, qq %in% 4:6)
D <- 5
N <- nrow(dat)
I <- length(unique(dat$workerid))
y <- dat$resp
cperc <- (dat$percent - 50) / 100
sub <- as.numeric(factor(dat$workerid))
few <- ifelse(dat$qq == 2, 1, 0)
fewer <- ifelse(dat$qq == 3, 1, 0)
many <- ifelse(dat$qq == 4, 1, 0)
more <- ifelse(dat$qq == 5, 1, 0)
most <- ifelse(dat$qq == 6, 1, 0)
above <- as.numeric(cperc > 0)
# res <- glm(y ~ cperc + few + fewer + many + more + most, family = binomial(link = "probit"),
# data = dat)
standat <- list(D = D, N = N, I = I, y = y, cperc = cperc, sub = sub,
few = few, fewer = fewer, many = many, more = more, most = most
, above = above)
return(standat)
}
myRunner <- function(standat, iter = 1000, warmup = 400, mod, nchains = 4, ...){
inits <- function(...){
list(delta = runif(standat$D, -.5, .5)
, sigma2 = runif(standat$D, 1, 1.5)
, beta = matrix(runif(standat$I * standat$D, -1, 1), nrow = standat$I, ncol = standat$D)
, nu = runif(standat$D, .1, 1)
, sigma2alpha = runif(standat$D, 1, 1.5)
, alpha = matrix(runif(standat$I * standat$D, .1, 1), nrow = standat$I, ncol = standat$D))
}
# inits <- list(c1 = inits)
fit <- sampling(mod, verbose=T,
data = standat,
iter = iter,
warmup = warmup,
chains = nchains,
cores = nchains,
# init = lapply(1:4, inits),
pars = c("delta", "nu", "beta", "alpha", "gamma", "sigma2", "sigma2alpha")
, ...)
return(fit)}
predat <- getDat(dat)rerun <- T
logmodfit <- myRunner(predat
, iter = 2000
, warmup = 850
, mod = logmod
, control = list(adapt_delta = .95, max_treedepth = 12)
, nchains = 6)
save(logmodfit, file = "outstudy2f.rda")hist(summary(logmodfit)$summary[,"n_eff"]
, breaks = 100
, main = ""
, xlab = "Number of effective samples")## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
## [1] 6900 1191
pMeans <- colMeans(pEst)
I <- predat$I
pdelta1 <- pMeans[1]
pdelta2 <- pMeans[2]
pdelta3 <- pMeans[3]
pdelta4 <- pMeans[4]
pdelta5 <- pMeans[5]
pnu1 <- pMeans[6]
pnu2 <- pMeans[7]
pnu3 <- pMeans[8]
pnu4 <- pMeans[9]
pnu5 <- pMeans[10]
pbeta1 <- pMeans[10 + 1:I]
pbeta2 <- pMeans[10 + I + 1:I]
pbeta3 <- pMeans[10 + 2 * I + 1:I]
pbeta4 <- pMeans[10 + 3 * I + 1:I]
pbeta5 <- pMeans[10 + 4 * I + 1:I]
palpha1 <- pMeans[10 + 5 * I + 1:I]
palpha2 <- pMeans[10 + 6 * I + 1:I]
palpha3 <- pMeans[10 + 7 * I + 1:I]
palpha4 <- pMeans[10 + 8 * I + 1:I]
palpha5 <- pMeans[10 + 9 * I + 1:I]
pgamma1 <- pMeans[10 + 10 * I + 1:I]
pgamma2 <- pMeans[10 + 11 * I + 1:I]
pgamma3 <- pMeans[10 + 12 * I + 1:I]
pgamma4 <- pMeans[10 + 13 * I + 1:I]
pgamma5 <- pMeans[10 + 14 * I + 1:I]
cperc <- (1:100 - 50)/100
curve.calc <- function(x, p = cperc){
a <- (p - x[1])/x[2]
x[3] + (1 - 2* x[3]) * exp(a)/(exp(a) + 1)
}
pps1 <- curve.calc(c(pdelta1, exp(pnu1), mean(pgamma1)))
pps2 <- curve.calc(c(pdelta2, exp(pnu2), mean(pgamma2)))
pps3 <- curve.calc(c(pdelta3, exp(pnu3), mean(pgamma3)))
pps4 <- curve.calc(c(pdelta4, exp(pnu4), mean(pgamma4)))
pps5 <- curve.calc(c(pdelta5, exp(pnu5), mean(pgamma5)))
layout(matrix(1:6, ncol = 2, byrow = T))
plot(1:100, pps1, type = "l", lwd = 2, col = qcols[1], ylim = c(0,1)
, ylab = "Probability 'true' response", xlab = "Percent")
lines(1:100, pps2, lwd = 2, col = qcols[2])
lines(1:100, pps3, lwd = 2, col = qcols[3])
lines(1:100, pps4, lwd = 2, col = qcols[4])
lines(1:100, pps5, lwd = 2, col = qcols[5])
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
legend("bottomright", legend = c("Few", "Fewer than half", "Many", "More than half", "Most")
, fill = qcols, bty = "n")
res1 <- apply(cbind(pbeta1, palpha1, pgamma1), 1, curve.calc)
matplot(res1, type = "l", lty = 1, col = qcols[1], main = "Few")
res2 <- apply(cbind(pbeta2, palpha2, pgamma2), 1, curve.calc)
matplot(res2, type = "l", lty = 1, col = qcols[2], main = "Fewer than half")
res3 <- apply(cbind(pbeta3, palpha3, pgamma3), 1, curve.calc)
matplot(res3, type = "l", lty = 1, col = qcols[3], main = "Many")
res4 <- apply(cbind(pbeta4, palpha4, pgamma4), 1, curve.calc)
matplot(res4, type = "l", lty = 1, col = qcols[4], main = "More than half")
res5 <- apply(cbind(pbeta5, palpha5, pgamma5), 1, curve.calc)
matplot(res5, type = "l", lty = 1, col = qcols[5], main = "Most")layout(matrix(c(0, 1,1, 2,2, 0, 3,3,4,4,5,5), nrow = 2, byrow = T))
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
matplot(1:100, res1, type = "l", lty = 1, main = "Few", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps1, lwd = 3, col = qcols[1])
matplot(1:100, res2, type = "l", lty = 1, main = "Fewer than half", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps2, lwd = 3, col = qcols[2])
matplot(1:100, res3, type = "l", lty = 1, main = "Many", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps3, lwd = 3, col = qcols[3])
matplot(1:100, res4, type = "l", lty = 1, main = "More than half", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps4, lwd = 3, col = qcols[4])
matplot(1:100, res5, type = "l", lty = 1, main = "Most", ylim = c(0,1), col = adjustcolor(1, .1)
, ylab = "Probability 'true' response", xlab = "Percent", frame.plot = F)
abline(h = .5, col = adjustcolor(1, .5))
abline(v = 50, col = adjustcolor(1, .5))
lines(1:100, pps5, lwd = 3, col = qcols[5])make.cor.plot <- function(mat, ...){
M <- cor(mat)
corrplot(M, type = "upper", mar = c(0,0,2,0), ...)
}
(M1 <- cor(cbind("few" = pbeta1, "fewer" = pbeta2, "many" = pbeta3, "more" = pbeta4, "most" = pbeta5)))## few fewer many more most
## few 1.00000000 -0.1838015 0.19901978 0.04755858 0.06710636
## fewer -0.18380149 1.0000000 -0.15247098 0.14619987 0.12243073
## many 0.19901978 -0.1524710 1.00000000 -0.09941862 -0.10257846
## more 0.04755858 0.1461999 -0.09941862 1.00000000 0.04351189
## most 0.06710636 0.1224307 -0.10257846 0.04351189 1.00000000
sds1 <- apply(cbind("few" = pbeta1, "fewer" = pbeta2, "many" = pbeta3, "more" = pbeta4, "most" = pbeta5), 2, sd)
means1 <- apply(cbind("few" = pbeta1, "fewer" = pbeta2, "many" = pbeta3, "more" = pbeta4, "most" = pbeta5), 2, mean)
(M2 <- cor(cbind("few" = palpha1, "fewer" = palpha2, "many" = palpha3, "more" = palpha4, "most" = palpha5)))## few fewer many more most
## few 1.000000000 -0.006115202 -0.035453228 -0.001782599 -0.01184313
## fewer -0.006115202 1.000000000 0.001350216 -0.025687059 -0.03406211
## many -0.035453228 0.001350216 1.000000000 0.047463519 0.04414786
## more -0.001782599 -0.025687059 0.047463519 1.000000000 -0.02992121
## most -0.011843131 -0.034062108 0.044147862 -0.029921206 1.00000000
sds2 <- apply(cbind("few" = palpha1, "fewer" = palpha2, "many" = palpha3, "more" = palpha4, "most" = palpha5), 2, sd)
means2 <- apply(cbind("few" = palpha1, "fewer" = palpha2, "many" = palpha3, "more" = palpha4, "most" = palpha5), 2, mean)
(M3 <- cor(cbind("few" = pgamma1, "fewer" = pgamma2, "many" = pgamma3, "more" = pgamma4, "most" = pgamma5)))## few fewer many more most
## few 1.00000000 -0.02770359 0.05354589 -0.04673417 -0.03695528
## fewer -0.02770359 1.00000000 -0.08732881 0.15346394 -0.07835742
## many 0.05354589 -0.08732881 1.00000000 -0.12645892 -0.04835597
## more -0.04673417 0.15346394 -0.12645892 1.00000000 -0.06687674
## most -0.03695528 -0.07835742 -0.04835597 -0.06687674 1.00000000
layout(matrix(c(1,1,2,2,3,3,0,4,4,5,5,0), ncol = 6, byrow = T))
make.cor.plot(cbind("Vagueness" = palpha1, "Threshold" = pbeta1, "Error" = pgamma1), title = "Few")
make.cor.plot(cbind("Vagueness" = palpha2, "Threshold" = pbeta2, "Error" = pgamma2), title = "Fewer")
make.cor.plot(cbind("Vagueness" = palpha3, "Threshold" = pbeta3, "Error" = pgamma3), title = "Many")
make.cor.plot(cbind("Vagueness" = palpha4, "Threshold" = pbeta4, "Error" = pgamma4), title = "More")
make.cor.plot(cbind("Vagueness" = palpha5, "Threshold" = pbeta5, "Error" = pgamma5), title = "Most")layout(matrix(c(1,2,3,3), ncol = 2, byrow = T), heights = c(.4, .6))
plot(pbeta4 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "More than half", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta5 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "Most", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta4 * 100 + 50, pbeta5 * 100 + 50
, pch = 19, col = adjustcolor(1, .5)
, ylab = "Threshold for 'most'"
, xlab = "Threshold for 'more than half'"
, xlim = c(40, 65)
, ylim = c(40, 65)
)
abline(0, 1, lwd = 1.5, col = "firebrick")mbeta4 <- pEst[,10 + 3 * I + 1:I]
mbeta5 <- pEst[,10 + 4 * I + 1:I]
mdiff <- mbeta4
for(i in 1:ncol(mdiff)){
mdiff[,i] <- apply(cbind(mbeta4[,i], mbeta5[,i]), 1, function(x) x[2] - x[1])
}
pCIs <- apply(mdiff, 2, quantile, probs = c(.025, .975))
pThresh <- apply(mdiff, 2, mean)indord <- order(pThresh)
plotCI(1:I, pThresh[indord]
, li = pCIs[1, indord], ui = pCIs[2, indord]
, pch = 21, pt.bg = "slateblue"
, ylab = "Most - More than half"
, xlab = "Participant")
abline(h = 0)Bayes factor assessing whether More is more than More than half for everyone.
## Prior probability
M <- 100000
s2 <- rinvgamma(M, 2, .1)
beta4 <- rnorm(M, 0, .1)
prioreff <- exp(pnorm(0, beta4, sqrt(s2), lower.tail = T, log.p = T) * I)
priorneg <- mean(prioreff)
## Posterior Probability
targ <- mdiff
good <- targ < 0 #evaluate their sign
all.good <- apply(good, 1, mean) #evaluate how often all theta estimates are postitive
postneg <- mean(all.good == 1) #Posterior probability of all theta_i being positive
# priorneg
# postneg
bfmomo <- postneg/priorneglayout(matrix(c(1,2,3,3), ncol = 2, byrow = T), heights = c(.4, .6))
plot(pbeta4 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "More than half", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta3 * 100 + 50, pch = 19, col = adjustcolor(1, .5), main = "Many", ylim = c(0, 100))
abline(h = 50, lwd = 1.5, col = "firebrick")
plot(pbeta4 * 100 + 50, pbeta3 * 100 + 50
, pch = 19, col = adjustcolor(1, .5)
, ylab = "Threshold for 'many'"
, xlab = "Threshold for 'more than half'"
, xlim = c(10, 60)
, ylim = c(10, 60)
)
abline(0, 1, lwd = 1.5, col = "firebrick")mbeta4 <- pEst[,10 + 3 * I + 1:I]
mbeta3 <- pEst[,10 + 2 * I + 1:I]
mdiffmany <- mbeta4
for(i in 1:ncol(mdiff)){
mdiffmany[,i] <- apply(cbind(mbeta3[,i], mbeta4[,i]), 1, function(x) x[2] - x[1])
}
pCIs <- apply(mdiffmany, 2, quantile, probs = c(.025, .975))
pThresh <- apply(mdiffmany, 2, mean)indord <- order(pThresh)
plotCI(1:I, pThresh[indord]
, li = pCIs[1, indord], ui = pCIs[2, indord]
, pch = 21, pt.bg = "slateblue"
, ylab = "More than half - Many"
, xlab = "Participant")
abline(h = 0)Bayes factor assessing whether Many is less than More than half for everyone.
## Prior probability
M <- 100000
s2 <- rinvgamma(M, 2, .1)
beta3 <- rnorm(M, 0, .1)
prioreff <- exp(pnorm(0, beta3, sqrt(s2), lower.tail = T, log.p = T) * I)
priorneg <- mean(prioreff)
## Posterior Probability
targ <- mdiffmany
good <- targ > 0 #evaluate their sign
all.good <- apply(good, 1, mean) #evaluate how often all theta estimates are postitive
postneg <- mean(all.good == 1) #Posterior probability of all theta_i being positive
# priorneg
postneg## [1] 0
As discussed in our meetings, we wanted to do a cluster analysis on the parameters from the probit model to assess whether there are distinct groups of individuals with certain parameter patterns. For example, we hypothesized that there might be a group of individuals who interpret the quantifier Most as More than half and therefore have a threshold close to 50% (or 0 in the current parameterization). Here, we do an extensive cluster analysis on 1. the parameters corresponding to the quantifier Most, 2. the parameters corresponding to the quantifier Many, and 3. an additional analysis on all parameter from all models.
The threshold, vagueness and guessing parameters for all participants were submitted to the cluster analysis. A determination of the optimal number of clusters consistently favored \(K = 2\) clusters.
mostpars <- cbind(pbeta5, palpha5, pgamma5)
layout(matrix(c(1,1,2,2,0,3,3,0), ncol = 4, byrow = T))
fviz_nbclust(mostpars, kmeans, method = "wss")Determining the optimal number of clusters.
Determining the optimal number of clusters.
gap_stat <- cluster::clusGap(mostpars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)Determining the optimal number of clusters.
The figure below shows the response curves based on the cluster centers as parameters. As can be seen there is a clear shift in threshold between the two clusters, one perfectly centered at 50%, and one much higher. This is consistent with our hypothesis that there might be some individuals who interpret Most as more than half, and others interpret it as more than more than half.
res.clust <- kmeans(x = mostpars, centers = 2, nstart = 20)
curveclust <- apply(res.clust$centers, 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Most")
abline(v = 50)allpars <- cbind(pbeta1, palpha1, pgamma1
, pbeta2, palpha2, pgamma2
, pbeta3, palpha3, pgamma3
, pbeta4, palpha4, pgamma4
, pbeta5, palpha5, pgamma5)
draw.ind.clust <- function(pars, cluster, cols = qcols, main = ""){
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
layout(matrix(c(1, 4, 2, 3), ncol = 2, byrow = T))
n <- length(unique(cluster))
plot(pars[, 1], pars[, 2], col = qcols[cluster]
, xlab = "Threshold", ylab = "Vagueness"
, pch = 19)
legend("topleft", legend = paste("Cluster", 1:n), fill = cols[1:n], bty = "n")
plot(pars[, 1], pars[, 3], col = qcols[cluster]
, xlab = "Threshold", ylab = "Guessing"
, pch = 19)
legend("topleft", legend = paste("Cluster", 1:n), fill = cols[1:n], bty = "n")
plot(pars[, 2], pars[, 3], col = qcols[cluster]
, xlab = "Vagueness", ylab = "Guessing"
, pch = 19)
legend("topleft", legend = paste("Cluster", 1:n), fill = cols[1:n], bty = "n")
plot(0,0,axes = F, col = "white", ylab = "", xlab = "")
text(0,0, main, cex = 4)
}Next, we investigated individual differences for all quantifiers based on the cluster analysis. The plots for the quantifier Most illustrate that the threshold is consistently and clear-cut associated with cluster affiliation. There are no additional consistencies for the other quantifiers based on this cluster analysis.
The threshold, vagueness and guessing parameters for all participants were submitted to the cluster analysis. A determination of the optimal number of clusters was inconsistent with either \(K = 1\) or \(K = 2\) favored. We therefore ran a cluster analysis with \(K = 2\) clusters.
manypars <- cbind(pbeta3, palpha3, pgamma3)
layout(matrix(c(1,1,2,2,0,3,3,0), ncol = 4, byrow = T))
fviz_nbclust(manypars, kmeans, method = "wss")Determining the optimal number of clusters.
Determining the optimal number of clusters.
gap_stat <- cluster::clusGap(manypars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)Determining the optimal number of clusters.
The figure below shows the response curves based on the cluster centers as parameters. As can be seen there is a clear shift in threshold between the two clusters, one perfectly centered at 50%, and one much lower. This is consistent with our hypothesis that there might be some individuals who interpret Many as more than half, and others interpret it as less than more than half.
res.clust.many <- kmeans(x = manypars, centers = 2, nstart = 20)
curveclust <- apply(res.clust.many$centers, 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Many")
abline(v = 50)Next, we investigated individual differences for all quantifiers based on the cluster analysis for quantifier Many. The plots for the quantifier Many illustrate that the threshold is consistently and clear-cut associated with cluster affiliation. In addition, individuals from cluster 1 — the cluster with thresholds lower than 50% — have the tendency for an increased vagueness parameter as well. This seems plausible: If the quantifier Many is interpreted as more than half, as seems to be the case for cluster 2, then there should be reduced vaguenuess associated with this quantifier.
In the second plot we assess whether the clustering on quantifier Many has a relationship with potential clustering for quantifier Few which is often thought of as the mirror quantifier of Many. However, the results are mostly inconsistent. There seems to be a small tendency that cluster 1 has a lower threshold on average than cluster 2, which seems plausible. However, this trend is mild at best. There are no other consistencies from this cluster analysis for the other quantifiers. Interestingly, the individuals who interpret Many as more than half do not seem to be the same individuals who interpret Most as more than half.
threshmost <- allpars[, 13]
threshmany <- allpars[, 7]
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
n <- 2
clustcols <- ifelse(res.clust$cluster == 1
, adjustcolor(qcols[res.clust.many$cluster], alpha.f = .4)
, adjustcolor(qcols[res.clust.many$cluster], alpha.f = 1))
plot(threshmost, threshmany
, col = clustcols
, xlab = "Threshold Most", ylab = "Threshold Many"
, pch = 19, cex = 1.5)
legend("bottomright", legend = paste0("Cluster [", c(1,1,2,2), ", ", c(1,2,1,2), "]")
, fill = ifelse(c(1,2,1,2) == 1
, adjustcolor(qcols[c(1,1,2,2)], alpha.f = .4)
, adjustcolor(qcols[c(1,1,2,2)], alpha.f = 1))
, bty = "n")The threshold, vagueness and guessing parameters for all participants for quantifiers Most and Many were submitted to the cluster analysis. A determination of the optimal number of clusters was inconsistent with either \(K = 1\) or \(K = 6\) favored. From a theoretical perspective given the two previous analyses we expected \(K = 4\) clusters, and therefore picked this number.
combpars <- cbind(allpars[, 7:9], allpars[, 13:15])
layout(matrix(c(1,1,2,2,0,3,3,0), ncol = 4, byrow = T))
fviz_nbclust(combpars, kmeans, method = "wss")Determining the optimal number of clusters.
Determining the optimal number of clusters.
gap_stat <- cluster::clusGap(combpars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)Determining the optimal number of clusters.
The figure below shows the response curves for Many and Most based on the cluster centers as parameters. As can be seen there is a clear shift in threshold between the four clusters for Many, and between most of the clusters for Most. It almost seems like the 50%cluster for Many breaks down more, but the 50% cluster for Most is quite consistent.
res.clust.comb <- kmeans(x = combpars, centers = 4, nstart = 20)
layout(matrix(1:2, ncol = 2))
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
curveclust <- apply(res.clust.comb$centers[, 1:3], 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Many")
abline(v = 50)
curveclust <- apply(res.clust.comb$centers[, 4:6], 1, curve.calc)
matplot(curveclust, type = "l", lty = 1, lwd = 2, col = qcols[5], main = "Most")
abline(v = 50)threshmost <- allpars[, 13]
threshmany <- allpars[, 7]
par(mgp = c(2, .7, 0), mar = c(3,3,1,1))
n <- 2
plot(threshmost, threshmany
, col = qcols[res.clust.comb$cluster]
, xlab = "Threshold Most", ylab = "Threshold Many"
, pch = 19, cex = 1.5)
legend("bottomright", legend = paste("Cluster", 1:4)
, fill = qcols[1:4]
, bty = "n")gap_stat <- cluster::clusGap(allpars, FUN = kmeans, nstart = 25,
K.max = 10, B = 50)
fviz_gap_stat(gap_stat)res.clust.all <- kmeans(x = allpars, centers = 4, nstart = 25)
curveclust.few <- apply(res.clust.all$centers[, 1:3], 1, curve.calc)
curveclust.fewer <- apply(res.clust.all$centers[, 4:6], 1, curve.calc)
curveclust.many <- apply(res.clust.all$centers[, 7:9], 1, curve.calc)
curveclust.more <- apply(res.clust.all$centers[, 10:12], 1, curve.calc)
curveclust.most <- apply(res.clust.all$centers[, 13:15], 1, curve.calc)
matplot(curveclust.most, type = "l", lty = 1, lwd = 2, col = c(qcols[5], adjustcolor(qcols[5], alpha.f = .5)))
matplot(curveclust.few, type = "l", lty = 1, lwd = 2, col = c(qcols[1], adjustcolor(qcols[1], alpha.f = .5)), add = T)
matplot(curveclust.fewer, type = "l", lty = 1, lwd = 2, col = c(qcols[2], adjustcolor(qcols[2], alpha.f = .5)), add = T)
matplot(curveclust.many, type = "l", lty = 1, lwd = 2, col = c(qcols[3], adjustcolor(qcols[3], alpha.f = .5)), add = T)
matplot(curveclust.more, type = "l", lty = 1, lwd = 2, col = c(qcols[4], adjustcolor(qcols[4], alpha.f = .5)), add = T)
abline(v = 50)Looking at individual differences for all quantifiers based on cluster analysis with all quantifiers
res.clust.all <- kmeans(x = allpars, centers = 2, nstart = 25)
curveclust.few <- apply(res.clust.all$centers[, 1:3], 1, curve.calc)
curveclust.fewer <- apply(res.clust.all$centers[, 4:6], 1, curve.calc)
curveclust.many <- apply(res.clust.all$centers[, 7:9], 1, curve.calc)
curveclust.more <- apply(res.clust.all$centers[, 10:12], 1, curve.calc)
curveclust.most <- apply(res.clust.all$centers[, 13:15], 1, curve.calc)
matplot(curveclust.most, type = "l", lty = 1, lwd = 2, col = c(qcols[5], adjustcolor(qcols[5], alpha.f = .5)))
matplot(curveclust.few, type = "l", lty = 1, lwd = 2, col = c(qcols[1], adjustcolor(qcols[1], alpha.f = .5)), add = T)
matplot(curveclust.fewer, type = "l", lty = 1, lwd = 2, col = c(qcols[2], adjustcolor(qcols[2], alpha.f = .5)), add = T)
matplot(curveclust.many, type = "l", lty = 1, lwd = 2, col = c(qcols[3], adjustcolor(qcols[3], alpha.f = .5)), add = T)
matplot(curveclust.more, type = "l", lty = 1, lwd = 2, col = c(qcols[4], adjustcolor(qcols[4], alpha.f = .5)), add = T)
abline(v = 50)Looking at individual differences for all quantifiers based on cluster analysis with all quantifiers
Originally, \(K_{ij} = 50\). But this value may be reduced after cleaning on the trial level.↩